Syndromic Surveillance using Generic Medical Entities on Twitter
Abstract
Public health surveillance is challenging because medical data are difficult to access in real time. We present a novel, effective and computationally inexpensive method for syndromic surveillance using Twitter data. The proposed method applies a regression model to a database previously built by running named entity recognition over GNIP Decahose Twitter data to identify mentions of symptoms, disorders and pharmacological substances. We compare the output of our method with the weekly flu and Lyme disease rates reported on the website of the US Centers for Disease Control and Prevention (CDC). Using 2012 and 2013 CDC flu statistics as training data, our method predicts the 2014 CDC-reported flu prevalence with a Spearman correlation of 94.9%, and the CDC Lyme disease rate for July to December 2014 with a Spearman correlation of 89.6%. When it uses only the Twitter data from the previous week, it predicts the prevalences of the same diseases over the same time periods with Spearman correlations of 93.31% and 86.9%, respectively.
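The evaluation described in the abstract (fit a regression on earlier data, predict a later period, score the prediction with Spearman correlation) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the regression type or the features, so a one-variable least-squares line stands in for the model, and the weekly counts and rates are made-up toy numbers, not CDC or Twitter data.

```python
from math import sqrt
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Hypothetical weekly counts of flu-related entity mentions and reported rates.
train_counts = [120, 340, 560, 410, 200]
train_rates = [1.1, 2.9, 5.2, 3.8, 1.9]
test_counts = [150, 300, 620, 480]
test_rates = [1.4, 2.5, 5.9, 4.1]

slope, intercept = fit_line(train_counts, train_rates)
predicted = [slope * c + intercept for c in test_counts]
rho = spearman(predicted, test_rates)
print(f"Spearman correlation on held-out weeks: {rho:.3f}")
```

Under the same assumptions, the one-week-ahead setting mentioned in the abstract would pair each week's reported rate with the entity counts from the week before, both when fitting and when predicting.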
Similar resources
NER for Medical Entities in Twitter using Sequence to Sequence Neural Networks
Social media sites such as Twitter are attractive sources of information due to their combination of accessibility, timeliness and large data volumes. Identification of medical entities in Twitter can support tasks such as public health surveillance. We propose an approach to perform annotation of medical entities using a sequence to sequence neural network. Results show that our approach improves...
Prevalence of sexually transmitted infections based on syndromic approach and associated factors among Iranian women
Background and purpose: Reproductive and sexual health related problems constitute one third of health problems among women aged 15 to 44 years. Sexually transmitted infections (STIs) are a significant challenge for human development. We aimed to assess the prevalence of STIs and identify associated factors among Iranian women. Materials and Methods: Through a cross-sectional study, 399 women ...
Twitter mining for fine-grained syndromic surveillance
BACKGROUND Digital traces left on the Internet by web users, if properly aggregated and analyzed, can represent a huge information dataset able to inform syndromic surveillance systems in real time with data collected directly from individuals. Since people use everyday language rather than medical jargon (e.g. runny nose vs. respiratory distress), knowledge of patients' terminology is essentia...
Automated learning of everyday patients' language for medical blogs analytics
Analyzing how people discuss health-related topics on dedicated forums and social networks such as Twitter can provide valuable insight for syndromic surveillance and for predicting disease outbreaks. In this paper we present a minimally trained algorithm to learn associations between technical and everyday language terms, based on pattern generalization and complete linkage clustering, and ...
Investigating Public Health Surveillance using Twitter
Microblog services such as Twitter are an attractive source of data for public health surveillance, as they avoid the legal and technical obstacles to accessing the more obvious and targeted sources of health information. Only a tiny fraction of tweets may contain useful public health information but in Twitter this is offset by the sheer volume of tweets posted. We present a system which can i...
Publication date: 2016